Rank | Count | Beginning |
---|---|---|
16875 | 1497 | Ol |
4457 | 1492 | Bu |
19678 | 649 | Onuň |
23408 | 463 | Şol |
17074 | 462 | Olar |
7799 | 460 | Emma |
29069 | 357 | Ýöne |
17673 | 280 | Olaryň |
24500 | 264 | Şonuň |
22773 | 220 | Şeyle |
24284 | 201 | Soňra |
3567 | 187 | Beýik |
26575 | 171 | Türkmenistanyň |
26326 | 159 | Türkmen |
25066 | 155 | Şu |
3298 | 144 | B.e.öň |
22287 | 142 | Sebäbi |
29569 | 133 | Ýurduň |
24063 | 121 | Şondan |
11272 | 112 | Hazirki |
11438 | 110 | Her |
19312 | 104 | Oňa |
22986 | 102 | Şeýlelikde |
25766 | 97 | Täze |
16144 | 96 | Mysal |
7497 | 92 | Eger |
4187 | 90 | Biziň |
10872 | 86 | Halk |
19480 | 83 | Ondan |
20336 | 82 | Ony |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N = 1, 2, 3, 4. This subsection starts with N = 1.
The most frequent word-N-grams at the beginning of sentences give some insight into sentence composition.
For N = 1 in particular, even a small corpus is enough to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt from sentences group by substring_index(sentence, ' ', 1) order by cnt desc limit 50;
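The same count can be reproduced outside the database. The following is a minimal sketch, assuming the sentences are available as a plain list of strings. The function name `sentence_beginnings`, its parameters, and the sample sentences are hypothetical illustrations and are not taken from the corpus. Like `substring_index(sentence, ' ', N)`, it keeps everything before the N-th space, or the whole sentence if it has fewer than N spaces.

```python
from collections import Counter


def sentence_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent beginnings made of the first n words.

    Hypothetical helper: mirrors the query's substring_index(sentence, ' ', n)
    by splitting on single spaces and keeping at most n tokens.
    """
    counts = Counter()
    for sentence in sentences:
        beginning = " ".join(sentence.split(" ")[:n])
        counts[beginning] += 1
    # most_common sorts by descending count, like "order by cnt desc limit 50"
    return counts.most_common(limit)


# Invented toy input, not actual corpus data.
sample = ["Ol geldi.", "Ol gitdi.", "Bu kitap gowy."]

for beginning, count in sentence_beginnings(sample, n=1):
    print(count, beginning)
```

Calling the function with `n=2`, `n=3`, or `n=4` gives the counts for the longer beginnings covered in the following subsections.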
4.3.1.2 Most Frequent Sentence Beginnings II
4.3.1.3 Most Frequent Sentence Beginnings III
4.3.1.4 Most Frequent Sentence Beginnings IV
4.3.1.1 Most Frequent Sentence Endings I
4.3.1.2 Most Frequent Sentence Endings II
4.3.1.3 Most Frequent Sentence Endings III
4.3.1.4 Most Frequent Sentence Endings IV